Time series forecasting is central to important application domains and poses a significant challenge to machine learning algorithms. Recently, neural network architectures have been widely applied to time series forecasting problems. Most of these models are trained by minimizing a loss function that measures how much the predictions deviate from the actual values. Typical loss functions include the mean squared error (MSE) and the mean absolute error (MAE). In the presence of noise and uncertainty, neural network models tend to replicate the last observed value of the time series, which limits their applicability to real-world data. In this paper, we provide a formal definition of this problem, together with examples of forecasts in which the problem is observed. We also propose a regularization term that penalizes the replication of previously seen values. We evaluate the proposed regularization term on synthetic and real-world datasets. Our results show that the regularization term mitigates the problem to some extent and yields more robust models.
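To make the idea of such a penalty concrete, here is a minimal sketch of what an "anti-copying" regularizer could look like. The paper's exact formulation is not reproduced here; the exponential penalty, the `lam` weight, and the PyTorch tensor names are illustrative assumptions only.

```python
import torch

def anti_copy_loss(y_pred, y_true, last_observed, lam=0.1):
    """MSE plus a penalty that is largest when the prediction copies the last observation."""
    mse = torch.mean((y_pred - y_true) ** 2)
    # exp(-|y_pred - last_observed|) equals 1 when the prediction exactly copies
    # the last observed value and decays toward 0 as the prediction moves away.
    copy_penalty = torch.mean(torch.exp(-torch.abs(y_pred - last_observed)))
    return mse + lam * copy_penalty
```

In use, `last_observed` would be the final value of each input window (e.g. `x[:, -1]`), so the extra term only discourages forecasts that sit on top of the most recent observation.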
We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE). Our team participated in all three subtasks: (i) sentence- and word-level quality prediction; (ii) explainable QE; and (iii) critical error detection. For all tasks, we build on top of the COMET framework, connecting it with the predictor-estimator architecture of OpenKiwi and equipping it with a word-level sequence tagger and an explanation extractor. Our results suggest that incorporating references during pretraining improves performance on downstream tasks across several language pairs, and that joint training with sentence- and word-level objectives yields a further boost. Furthermore, combining attention and gradient information proved to be the top strategy for extracting good explanations from sentence-level QE models. Overall, our submissions achieved the best results for all three tasks for almost all language pairs.
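As a rough illustration of what joint sentence- and word-level training can mean in practice, the sketch below combines a sentence-level regression loss with a word-level tagging loss. The `alpha` trade-off, the OK/BAD tag encoding, and the tensor shapes are assumptions made for illustration, not the submission's actual code.

```python
import torch
import torch.nn.functional as F

def joint_qe_loss(sent_pred, sent_true, word_logits, word_tags, alpha=0.5):
    """Weighted sum of sentence-level MSE and word-level cross-entropy (e.g. OK/BAD tags)."""
    sent_loss = F.mse_loss(sent_pred, sent_true)
    word_loss = F.cross_entropy(
        word_logits.view(-1, word_logits.size(-1)),  # (batch * seq_len, num_tags)
        word_tags.view(-1),                          # (batch * seq_len,)
        ignore_index=-100,                           # skip padded / untagged positions
    )
    return alpha * sent_loss + (1.0 - alpha) * word_loss
```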
Trainable evaluation metrics for machine translation (MT) exhibit strong correlation with human judgements, but they are often hard to interpret and might produce unreliable scores under noisy or out-of-domain data. Recent work has attempted to mitigate this with simple uncertainty quantification techniques (Monte Carlo dropout and deep ensembles), however these techniques (as we show) are limited in several ways -- for example, they are unable to distinguish between different kinds of uncertainty, and they are time and memory consuming. In this paper, we propose more powerful and efficient uncertainty predictors for MT evaluation, and we assess their ability to target different sources of aleatoric and epistemic uncertainty. To this end, we develop and compare training objectives for the COMET metric to enhance it with an uncertainty prediction output, including heteroscedastic regression, divergence minimization, and direct uncertainty prediction. Our experiments show improved results on uncertainty prediction for the WMT metrics task datasets, with a substantial reduction in computational costs. Moreover, they demonstrate the ability of these predictors to address specific uncertainty causes in MT evaluation, such as low quality references and out-of-domain data.
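Of the training objectives listed above, heteroscedastic regression is the simplest to sketch: the metric predicts a mean quality score together with a per-example variance and is trained with the Gaussian negative log-likelihood. The snippet below shows the textbook form of that loss, not COMET's actual implementation; `mu`, `log_var`, and `y_true` are assumed tensor names.

```python
import torch

def heteroscedastic_nll(mu, log_var, y_true):
    """Gaussian negative log-likelihood with a per-example predicted variance."""
    # log_var = log(sigma^2); constant terms of the Gaussian NLL are dropped.
    var = torch.exp(log_var)
    return torch.mean(0.5 * (log_var + (y_true - mu) ** 2 / var))
```

Predicting the log-variance rather than the variance itself keeps the predicted uncertainty positive without any extra constraint, which is why that parameterization is commonly used.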